Review: G. De Leve, Generalized Markovian Decision Processes-Parts I and II


Similar resources

Online Learning in Markovian Decision Processes

Non-Deterministic Policies in Markovian Decision Processes

Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional meth...

Small Animal Anesthesia Parts I and II

Anticholinergics Anticholinergics are competitive antagonists of acetylcholine at postganglionic parasympathetic muscarinic receptors. Stimulation of muscarinic receptors induces salivation, pupillary constriction, bronchoconstriction, gastric acid secretion, gastrointestinal motility, and slowing of the heart rate. The two most commonly used anticholinergics in veterinary medicine are atropin...

Constrained Markovian decision processes: the dynamic programming approach

We consider semicontinuous controlled Markov models in discrete time with total expected losses. Only control strategies which meet a set of given constraint inequalities are admissible. One has to build an optimal admissible strategy. The main result consists in the constructive development of optimal strategy with the help of the dynamic programming method. The model studied covers the case o...

Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes

Reinforcement learning (RL) has become a central paradigm for solving learning-control problems in robotics and artificial intelligence. RL researchers have focussed almost exclusively on problems where the controller has to maximize the discounted sum of payoffs. However, as emphasized by Schwartz (1993), in many problems, e.g., those for which the optimal behavior is a limit cycle, it is mo...

Journal

Journal title: The Annals of Mathematical Statistics

Year: 1966

ISSN: 0003-4851

DOI: 10.1214/aoms/1177699184